Jatropha curcas is believed to be a potential biofuel crop because it does not compete for planting
land with edible oil plants. However, J. curcas has not been domesticated for producing biodiesel.
Conventional breeding to increase the productivity of J. curcas started in the early 2000s. Although
some genetic improvement of oil yield has been made through conventional breeding, oil yield is
still too low (<2000 kg/ha/year) to make biodiesel production from J. curcas sustainable.
Because marker-assisted selection (MAS) and genomic selection (GS) have enormous potential to speed
up genetic gain through early selection, genomic resources such as DNA markers, a linkage map,
transcriptome sequences and a draft genome have been developed, and some are being used in genetic
improvement for sustainable production of biodiesel. In this review, we present the recent advances in
conventional breeding, as well as the development and applications of genomic resources to improve the
quantity and quality of biodiesel extracted from seeds of J. curcas. We also highlight the need for
a well-assembled reference genome of J. curcas and the potential of next-generation sequencing (NGS)
for genome-wide association studies (GWAS) and GS to speed up the increase of the yield and quality of
biodiesel from J. curcas.
1. Introduction

Fossil fuel reserves are limited, while demand is ever-increasing
worldwide [1]. Combustible fuels are the world's main energy
resource and are at the center of global energy demands [2]. People
are increasingly concerned about climate change [3], the dwindling
supply of fossil fuel, and its unstable and rising costs, which
has motivated researchers to seek alternative, renewable energy
sources [3-5]. Biofuels are one of the solutions to energy security,
the reduction of greenhouse gas emissions and sustainable
development [6]. Biodiesel has received considerable worldwide
attention in the past years as it is environmentally friendly [7].
However, many countries (e.g. China and Japan) do not allow the
use of edible oils (e.g. soybean, palm and rapeseed oils) to produce
biodiesel, in order to ensure food security [8]. Therefore, alternative
plant sources of non-edible oil for use in the production of biodiesel
have been extensively sought after [9].

The plant Jatropha curcas Linnaeus originated from Mexico and
is an underutilized oil-bearing crop [10]. It was brought to Asia
and Africa by Portuguese traders 350 years ago [10]. Its seeds can
be processed into biodiesel, and it is believed that J. curcas can
grow on poor soils and in areas of low rainfall (from 250 mm a year);
hence, it has been promoted as the ideal plant for small farmers in
countries and regions such as India [11], China [12], Indonesia [13]
and Africa [14]. However, J. curcas had never been domesticated for
producing biodiesel before recent years [15]. Since 2008, several
countries have started breeding programs to improve seed yield
[16-19]. According to published reports, each mature tree produces an
average of 4 kg of seeds per year when cultivated under optimal
conditions [12,13,15]. Its oil yield is still much lower than that of
other oil-producing plant species, such as oil palm, which is
the main bottleneck in plantation of J. curcas for production of
biodiesel [20]. Besides seed yield, other traits such as the number
of female flowers, later maturity, resistance to lodging, resistance
to pests and diseases, reduced plant height and high natural
ramification of branches are also important for improving oil yield
[21-23]. However, genetic improvement for oil production
with traditional breeding is very slow and tedious, as phenotypes
can only be measured after they are expressed.

Molecular breeding, also called marker-assisted selection
(MAS), refers to the use of DNA markers that are tightly linked
to traits to assist phenotypic selection [24]. In
comparison to traditional breeding, molecular breeding has
several advantages, such as selection at the seedling stage, no
influence of environment, and selection of preferred homozygotes,
thus accelerating genetic improvement. With the rapid development
of next-generation sequencing (NGS) technologies, it is now
easy to detect and characterize a large number of DNA markers
using NGS and polymerase chain reaction (PCR) [24]. Molecular
breeding has already been applied to speed up genetic improvement
in important agronomic species, such as rice and maize
[24]. In jatropha, molecular breeding is still in its infancy [25],
although some reports on DNA markers [26,27], a linkage map [28]
and quantitative trait locus (QTL) mapping for seed yield [25,28,29]
have been published.
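
As a minimal illustration of the selection step in MAS, the sketch below screens seedlings by their genotype at a single co-dominant marker and keeps those that are homozygous for the favorable allele. The allele labels, seedling identifiers and the function name `select_seedlings` are hypothetical and are not taken from any of the cited studies; a real program would use many markers and validated marker-trait associations.

```python
from typing import Dict, List, Tuple

# Hypothetical genotype at one co-dominant marker assumed to be tightly
# linked to a trait of interest: each seedling carries two alleles.
Genotype = Tuple[str, str]


def select_seedlings(genotypes: Dict[str, Genotype], favorable: str) -> List[str]:
    """Return the IDs of seedlings homozygous for the favorable allele."""
    return sorted(
        seedling_id
        for seedling_id, (allele_1, allele_2) in genotypes.items()
        if allele_1 == favorable and allele_2 == favorable
    )


if __name__ == "__main__":
    # Illustrative data only: "A" is assumed to be the favorable allele.
    seedlings = {
        "S1": ("A", "A"),
        "S2": ("A", "B"),
        "S3": ("B", "B"),
        "S4": ("A", "A"),
    }
    print(select_seedlings(seedlings, "A"))
```

Because the genotype can be scored from seedling DNA, this filter can be applied long before the trait itself is expressed, which is the source of the time saving described above.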

Several important issues on jatropha biodiesel concerning
plantation, tissue culture, biotechnological and biochemical
engineering, biodiesel production and applications, economy and
policy have already been reviewed [11-13,15,30]. This review differs
from those excellent reviews in that it combines relevant information
about recent advances in the development of genomic resources
and their applications in accelerating genetic improvement of J. curcas
to enhance the quantity and quality of biodiesel extracted from its
seeds. We also discuss the potential of genome-wide
association studies (GWAS) and genomic selection (GS) for speeding
up the increase of the yield and quality of biodiesel from J. curcas.


2.  Conventional  breeding  for  increasing  oil  yield 

2.1.  Plantation  and  phenotypic  variations 

Jatropha curcas L., which belongs to the Euphorbiaceae family, is a
perennial crop. Its seeds contain up to 35% oil [10]. J. curcas is
traditionally used as a hedge plant, and various parts of the tree
have been collected for medicinal uses. However, J. curcas was not
domesticated and extensively selected for oil yield before the
2000s [31]. As a result, J. curcas is currently still a wild plant with
low oil yields. The oil yield of wild J. curcas is less than
1000 kg/ha/year [12,13], much lower than that of some major oil crops
such as oil palm, coconut and rapeseed [32]. We have summarized the
annual oil yield of major oil-producing plants in Fig. 1. Owing to the
realization of its potential for producing biodiesel to ease the
oil crisis, reduce pressure on the environment and control urban
air pollution, and to many claims about its advantages, J. curcas has
been believed to be an ideal plant for producing biofuel [33,34]. Thus,
the plantation of J. curcas moved from small scale to large scale in
India, China, Malaysia, Indonesia, the Philippines, Burma, Saudi Arabia,
Ghana, South Africa, Senegal, Nigeria, Tanzania, Ethiopia, Zambia,
Zimbabwe and other countries. In 2008, the total plantation
area was 900,000 ha globally, of which 84.4% (760,000 ha)
was in Asia, 13.3% (120,000 ha) in Africa and 2.2% (20,000 ha) in
Latin America. According to a recent report of the FAO [35], the total
plantation area of J. curcas is expected to be 12.8 million ha by
2015. The largest producing countries will be Indonesia in Asia,
Ghana and Madagascar in Africa, and Brazil in Latin America [35].


Fig. 1. Annual oil yield (kg/ha/year) of major oil-producing plants. Data were extracted from
several sources [32,44,45].
Substantial phenotypic variations in seed yield, tree height,
branch number, flowering time, and female to male flower ratio
were observed [9,11,12]. Seed yield varied from 300 to
7500 kg/ha/year [9,11,12]. However, after a few years of large-scale
plantation, people realized that J. curcas was not a wonder plant for
sustainable biodiesel production on marginal land because of its very
low oil yield (1500 kg/ha/year for improved varieties, see Figs. 1 and 2),
and concluded that plantation of J. curcas on marginal land could
only give marginal yield [20,36]. Even on normal fertile soils, J.
curcas [11,22,37] was no match for other major oil-producing
crops [32] (see Fig. 1). On the one hand, these data reflect the
difficulties in making the plantation of J. curcas profitable and
sustainable; on the other hand, they highlight the potential for
genetic improvement through breeding.
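
The relationship between the seed yields and oil yields quoted above can be sketched with simple arithmetic: oil yield is roughly seed yield multiplied by the oil fraction of the seed. The snippet below is only an illustration under stated assumptions. The default oil fraction of 0.35 is the upper bound given earlier ("up to 35% oil"), so the results are optimistic, and the `extraction_efficiency` parameter is a hypothetical addition that does not come from the cited reports.

```python
def oil_yield_kg_per_ha(seed_yield_kg_per_ha: float,
                        oil_fraction: float = 0.35,
                        extraction_efficiency: float = 1.0) -> float:
    """Estimate annual oil yield (kg/ha/year) from annual seed yield.

    oil_fraction: share of seed mass that is oil; 0.35 is the upper
        bound quoted in the text, so the estimate is optimistic.
    extraction_efficiency: share of that oil actually recovered; this
        parameter is an assumption added for illustration.
    """
    if seed_yield_kg_per_ha < 0:
        raise ValueError("seed yield must be non-negative")
    if not 0.0 <= oil_fraction <= 1.0:
        raise ValueError("oil_fraction must lie between 0 and 1")
    if not 0.0 <= extraction_efficiency <= 1.0:
        raise ValueError("extraction_efficiency must lie between 0 and 1")
    return seed_yield_kg_per_ha * oil_fraction * extraction_efficiency


if __name__ == "__main__":
    # Lower and upper ends of the reported seed-yield range (kg/ha/year).
    for seed_yield in (300, 7500):
        print(seed_yield, round(oil_yield_kg_per_ha(seed_yield), 1))
```

Under these assumptions, only the upper end of the reported seed-yield range would exceed the 2000 kg/ha/year oil yield that the abstract treats as the threshold for sustainable production.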

2.2.  Conventional  breeding  and  its  achievements 

Conventional breeding for genetic improvement of J. curcas
started in the early 2000s in India, China, Thailand,
the Philippines, Mexico, Guatemala, and Brazil [9,11,12]. Although
several traits, such as seed yield, oil content, female to male flower
ratio, synchrony of flowering and fruiting, branch number and oil
quality, are important for genetic improvement for producing
biodiesel, increasing the oil yield is the priority [21,35]. In the past
few years, many studies have focused on the collection of germplasm and
selective breeding. Montes et al. undertook evaluation trials in J.
curcas involving 225 lines from Asia, Africa and Latin America to
study the degree of variability [38]. Their study revealed low genetic
variability in African and Indian accessions and high genetic
variability in Guatemalan and other Latin American lines. Evaluation
of J. curcas for phenotypic and genotypic variations by Basha and
Sujatha [39] showed similar results, indicating low genetic variation
in J. curcas. Selective breeding is the core activity of genetic
improvement for oil production [40]. Mass selection, recurrent
selection, hybrid breeding and induced mutation breeding have been
applied to improve trait performance. Details about these methods for
genetic improvement can be found in the review by Divakaran et al. [21].

Fig. 2. Annual oil yield (kg/ha/year) of improved varieties of jatropha. Data were extracted from
several sources [10,31,35,44,45,48].

In India, mass breeding started in the early 2000s. The National
Oilseeds and Vegetable Oils Development Board, India collected over
5000 accessions with a network of 40 institutions and identified
1855 candidate plus trees. The Department of Biotechnology, India,
collected 1500 accessions [41]. Kaushik et al. analyzed oil content
and kernel to seed coat ratio for 1000 seed samples from 12 states
in India, and found that the collection from Uttaranchal had the
highest percentage (73%) of high oil yielding plants [42]. Most of
the J. curcas varieties were developed from selections made in
natural populations [10,43]. The first variety, SDAUJ I (Chatrapati),
was identified as the best among 496 seed sources [44] for
commercial cultivation in the semi-arid and arid regions of Gujarat
and Rajasthan in India. According to a recent review by Pandey [31],
the dry seed yield of J. curcas in India is still too low (<6 t/ha) to be
profitable, and earlier claims of high seed yield could not be confirmed
by rigorous studies.

In China, around 100 accessions were selected for further
examination after screening of over 800 [12]. The annual yields
of 10 varieties of J. curcas were higher than 2.5 kg/tree and kernel
oil content was >65%. After a 4-year plantation trial of material from
six different sources, Yang et al. selected one with higher yield. The oil
yield of the selected variety was 1566 kg/ha, which is more than
five times the national best yield [45]. Over 100 accessions of elite
germplasm were genotyped using inter-simple sequence repeat (ISSR)
markers [12,46]. More than 500 accessions were mutated using
chemical mutagenesis, Co-60 radiation and spaceflight (space-carrying)
technologies [12]. Ten good seed sources, one new variety and 10 mutants
were selected and identified in the field based on biological
characteristics, economic traits, resistance characteristics and other
indicators [12]. In China, variety breeding for J. curcas is lagging
behind its large-scale plantation [12]. The lack of high-yielding
varieties is one of the main hurdles for developing the J. curcas
industry in China [45,47].

In Africa and South America, mass breeding has been conducted
in several countries in co-operation with some European countries.
According to a recent presentation by Dr Van Loo (personal
communication) and other reports [35], the best oil yield was
1500 kg/ha/year. In addition to improving the oil yield, Zimbabwe has
developed non-toxic varieties of J. curcas, which would make the seed
cake remaining after oil extraction suitable as animal feed without
detoxification (http://precedings.nature.com/documents/2658/version/1).
However, the oil yield of these varieties is not yet known.

The Singapore company JOil (S) Pte, in co-operation with the
Temasek Life Sciences Laboratory, started breeding J. curcas
in 2006. Field tests showed that the seed yield of their selected
varieties reached 2.4 t/ha on poor land in India in the
first year. Small-scale tests on selected elite varieties showed that,
under good conditions, the oil yield could reach 2000 kg/ha/year
(http://www.joil.com.sg/Latest-News-Archive). The British company
BP-D1 reported that the oil yield of several elite varieties
was as high as 2000 kg/ha/year under good management [48].
We have summarized the expected and the realized highest oil
yields of J. curcas reported in scientific journals and conferences in
Fig. 2. Based on scientific reports, we estimated that the maximum
oil yield of J. curcas could be 4 t/ha/year.

Although a number of institutes are involved in breeding,
it seems that the results of their breeding have not been published. On
the other hand, a number of commercial companies (e.g. BP-D1)
have already released their breeding results on their websites,
and these seem to be very promising. However, these data need to be
verified by large-scale field trials conducted by other parties.
Although conventional breeding of J. curcas has already increased
the yield of oil, the improvement is very
slow. About 5-7 years are required to obtain improved cultivars
through conventional breeding. Furthermore, oil quality is very
difficult to improve through conventional breeding approaches.
MAS is expected to accelerate genetic improvement not only for
seed yield, but also for traits which are difficult to select for, such as
oil quality and disease resistance [49]. For MAS, DNA markers
closely linked to important traits must be identified. To identify
these associations, DNA markers must be cloned, characterized and
mapped across the whole genome [50,51].

3.  Genomic  resources  for  speeding  up  the  increase  of  oil  yield 
and  quality 

Genomic  resources  such  as  molecular  markers,  linkage  maps, 
ESTs  and  genome  sequences  are  powerful  tools  to  speed  up  genetic 
improvement  for  oil  yield  and  quality  through  MAS  or  GS  [52,53]. 
The  major  institutions  working  on  molecular  breeding  of  J.  curcas 
can  be  found  in  Table  1.  Availability  of  selected  genomic  resources  in 
J.  curcas  is  summarized  below. 

3.1.  DNA  markers 

In a genome, most of the DNA sequences are conserved among
individuals, while a small proportion is variable. DNA markers are
the variable DNA sequences in a genome that can be differentiated
using molecular or biochemical methods. Among all genomic
resources, DNA markers have direct use in germplasm
characterization, linkage and QTL mapping and molecular breeding [54].
Although several older types of DNA markers, such as
random amplified polymorphic DNA (RAPD), amplified fragment
length polymorphism (AFLP) and inter-simple sequence repeat
(ISSR) markers, have been used in studying genetic variations in
natural and cultivated populations of J. curcas [55-57], two types of
DNA markers, microsatellites [58] and single nucleotide
polymorphisms (SNPs) [59], are currently the most preferred in linkage
mapping and studies on genetic diversity in agronomic species. This is
mainly because microsatellites and SNPs are highly abundant and
amenable to cost-effective, high-throughput scoring. More recently,
a new type of DNA polymorphism, copy number variation (CNV),
has been reported in humans and model organisms [60]. The
application of CNV in agronomic species has only recently
begun [61]. However, no CNV has been reported in J. curcas.

RAPD, ISSR and AFLP have been used in analyzing genetic
variations of wild and cultivated varieties of J. curcas and their
relationships [16,39,46,55,56,62-68]. The general finding is that
genetic diversity is very low in J. curcas.

Microsatellites have already been identified and used in J.
curcas. In 2008, our lab presented over 300 microsatellites at an
international conference on J. curcas in Singapore [28]. Other
laboratories have also identified a few microsatellites [27,69].
Microsatellites have been used increasingly in J. curcas [28,56,69-72]
for the evaluation of germplasm and for genome mapping. Currently,
owing to the advent of NGS technologies [73], resequencing of a
genome of 3 gigabases costs less than 1000 USD using the
Illumina HiSeq 2000. Using bioinformatic software (see review
[74]), microsatellites can be easily detected in genome sequences.


Table 1
Major institutions involved in molecular breeding of Jatropha curcas worldwide.

| Country | Institution | Major activities | References |
|---|---|---|---|
| Singapore | Temasek Life Sciences Lab | Microsatellites/SSR, SNP, genes, linkage and QTL mapping, sequencing transcriptome and genome, MAS, transgenic jatropha, tissue culture | [25,28,29,82,87,102,110,111] |
| | Joil Pte | AFLP, sequencing genome, MAS, transgenic jatropha, tissue culture | [112] |
| India | Biotech Park | DNA markers, ESTs, genes | [65,112] |
| | Osmania University | AFLP | [65] |
| | Central Salt & Marine Chemical Research Institute | RAPD, AFLP, SSR | [70,71,113] |
| | Tamil Nadu Agricultural University | RAPD, ISSR, ESTs, genes | [114-116] |
| | SRM University | ESTs, gene | [117,118] |
| | Dhirubhai Ambani Life Sciences Center | ESTs | [119] |
| China | Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences | ESTs, genes, ISSR, RAPD | [16,79] |
| | South China Agricultural University | ESTs, genes, SSRs, AFLP | [16,79,120] |
| | South China Botanical Garden, Chinese Academy of Sciences | SSR, RAPD, AFLP, genes, transgenic plants | [56,101,121] |
| | Sichuan University | ESTs, genes, ISSR | [122-124] |
| | Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences | BAC library, ESTs, SSR | [26,125] |
| Japan | Kazusa DNA Research Institute | Genome sequencing | [89] |
| Thailand | Annamalai University | Mutagenesis, microsatellites, ISSR | [126,127] |
| | | RAPD | [127] |
| | Kasetsart University | ISSR, microsatellites, genes | [126,128,129] |
| Philippines | University of the Philippines Los Banos | Genetic variation | [130] |
| Brazil | Universidade Estadual de Santa Cruz (UESC) | ESTs, genes | [131] |
| | Universidade Catolica de Brasilia-SGAN | RAPD | [132] |
| | State University of Campinas, UNICAMP | ESTs, genes | [133] |
| Indonesia | Indonesian Center for Agricultural Biotechnology and Genetic Resources Research and Development | RAPD | [134] |
| Malaysia | University Putra Malaysia | ISSR | [135] |
| The Netherlands | Plant Research International B.V. | Molecular genetics | [136] |
| USA | SG Biofuels | SSRs, SNP, genome sequencing | [137] |
| UK | BP-D1 | DNA markers | [48] |
| Africa | Biotechnology Laboratory, Kenya Forestry Research Institute | RAPD | [62] |
Because the draft genome sequence of J. curcas [75] is already available,
microsatellites can be easily identified from it, which could save a
substantial amount of money and time in the development of DNA
markers.
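
To make the idea of detecting microsatellites in sequence data concrete, the following sketch scans a DNA string for tandem repeats with a regular expression. It is not the software covered in the review cited above [74]; the function name `find_microsatellites` and the thresholds (motifs of 2-6 bases, at least five tandem copies) are assumptions chosen for illustration, and dedicated tools handle imperfect repeats, compound repeats and whole-genome input far more thoroughly.

```python
import re
from typing import List, NamedTuple


class Microsatellite(NamedTuple):
    start: int    # 0-based start position in the sequence
    motif: str    # repeat unit, e.g. "AG"
    repeats: int  # number of tandem copies of the motif


def find_microsatellites(sequence: str, min_motif: int = 2, max_motif: int = 6,
                         min_repeats: int = 5) -> List[Microsatellite]:
    """Find perfect tandem repeats (simple sequence repeats) in a DNA string."""
    if min_repeats < 2 or min_motif < 1 or max_motif < min_motif:
        raise ValueError("invalid motif or repeat thresholds")
    # A lazily matched motif followed by at least (min_repeats - 1) exact
    # copies of itself; the lazy quantifier favours the shortest repeat unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append(Microsatellite(match.start(), motif,
                                   len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    # Toy sequence with one (AG)n repeat flanked by unique sequence.
    print(find_microsatellites("TTGC" + "AG" * 6 + "CCTA"))
```

Primers designed in the unique flanking sequence on either side of each hit would then be tested for polymorphism among accessions before a repeat is used as a marker.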

SNPs have recently been identified. By sequencing pooled
samples, Silva-Junior et al. identified a total of 18,225 SNPs in
11.9 gigabases of sequence, suggesting an extremely low frequency of
SNPs in J. curcas [76]. Recently, Gupta discovered 2482 informative
SNPs by sequencing 148 global collections of J. curcas lines and found
that genetic diversity was narrower among the indigenous
genotypes than among the exotic genotypes of J. curcas [77]. Our
group developed some SNPs in ESTs and used them in constructing a
linkage map of jatropha [28]. Although some SNP markers are now
available for J. curcas and could be very useful in molecular breeding
for substantial improvement of biodiesel yield and quality, no
cost-effective, high-throughput genotyping platforms have been developed
for J. curcas.

3.2.  A  linkage  map 

A linkage map is the essential framework for genome-wide
identification of associations between DNA markers and traits [51].
One or several populations in which DNA markers segregate
are required to construct a linkage map. Population sizes vary from
dozens to a few hundred individuals. For high-resolution mapping, a
large number of individuals (>200) is required. In J.
curcas, because of the very low DNA variation, it is difficult to
construct a highly informative reference family for linkage mapping
[28]. Therefore, families generated by interspecies crosses are the
better choice. According to previous work on interspecies
hybridization, J. curcas can hybridize with Jatropha integerrima,
Jatropha canascens, and Jatropha gossypifolia [21].

A first-generation linkage map was constructed using a mapping
population containing two families consisting of 96 individuals.
The families were produced by an interspecies (J. curcas × J.
integerrima) cross and backcross [28]. The mapping population
was genotyped with co-dominant DNA markers (i.e. SSRs and SNPs
in genes). A total of 506 markers (216 microsatellites and 290 SNPs
from ESTs) were mapped onto 11 linkage groups. The length of the
map was 1440.9 cM, with an average marker spacing of 2.8 cM.
BLAST searches of the 222 ESTs containing SSR and SNP markers mapped on
the linkage map against EST databases revealed that 91.0%, 86.5%
and 79.2% of J. curcas ESTs were homologous to counterparts in
castor bean, poplar and Arabidopsis, respectively. A total of 192
orthologous markers of J. curcas were mapped to the assembled whole
genome sequence of Arabidopsis thaliana, and 38 syntenic blocks were
detected by comparative mapping. Small linkage blocks were well
conserved, but often shuffled. The linkage map and the comparative
mapping data laid the foundation for QTL mapping of
agronomic traits, MAS and cloning of genes responsible for
phenotypic variations. An additional 500 microsatellites and SNPs have
already been genotyped and will be mapped to the existing
linkage map of J. curcas. Although other research groups have claimed
that they were constructing linkage maps for J. curcas, no
other linkage maps have been reported so far. The slow progress of
linkage mapping in J. curcas may be due to the low genetic variation
in its natural resources, which makes it difficult to generate
informative reference families for mapping. Therefore, we recommend
using informative families generated by crossing different species
(e.g. J. curcas, J. integerrima, J. canascens and J. gossypifolia) for
linkage mapping.
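
The summary statistics of the first-generation map can be checked with a few lines of arithmetic. The sketch below recomputes the average marker spacing from the map length and marker count reported above; the function name and the optional interval-based variant (dividing by markers minus linkage groups) are assumptions added for illustration, since the text does not state which convention was used.

```python
def average_marker_spacing(map_length_cm: float, n_markers: int,
                           n_linkage_groups: int = 0) -> float:
    """Average marker spacing in cM.

    With n_linkage_groups=0 the map length is divided by the number of
    markers. If the number of linkage groups is given, the length is
    divided by the number of marker intervals (markers minus groups).
    """
    divisor = n_markers - n_linkage_groups
    if divisor <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / divisor


if __name__ == "__main__":
    microsatellites, snps = 216, 290
    markers = microsatellites + snps  # 506 markers in total
    print(round(average_marker_spacing(1440.9, markers), 1))
    print(round(average_marker_spacing(1440.9, markers, 11), 1))
```

The simple ratio of map length to marker number reproduces the 2.8 cM quoted in the text, while the interval-based convention gives a slightly larger value.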

3.3.  Transcriptome 

The transcriptome is the complete set of all RNA molecules,
including mRNA, tRNA, rRNA, miRNA and other long or short
non-coding RNA, in one cell or a population of cells. The
transcriptome can vary among cells and tissues and with external
environmental conditions. Studying the dynamics and regulation of the
transcriptome is critically important for understanding the
functions of a genome and the underlying biological processes.
Several projects were initiated to sequence expressed sequence
tags (ESTs) of J. curcas. Over 100,000 EST sequences have been
deposited in GenBank. Costa et al. sequenced 13,249 ESTs from
developing and germinating seeds [78]. They identified most
known genes related to lipid synthesis and degradation. They also
detected ESTs coding for proteins that may be involved in the
toxicity of seeds. In addition, they found a high number of ESTs
(800) containing transposable element-related sequences in the
developing seed library compared with the
germinating seed library. Chen et al. established three cDNA
libraries with mRNA from embryos at different developmental
stages [79]. They sequenced ESTs and obtained 9844 unique
sequences of which 1070 were contigs and 3595 were singletons.
Yadav et al. constructed a normalized and full-length enriched
cDNA library from developing seeds [69]. The library contained
about 1 × 10^6 clones with an average insert size of 2.1 kb. They
sequenced a total of 12,084 ESTs using Sanger sequencing. The
average length of the high-quality reads was 576 bp. After
assembly, 2258 contigs and 4751 singletons were obtained.
Annotation of these 7009 unisequences by BLASTX revealed that most
(6386/7009) could be annotated, and 6233 of the
7009 unisequences were identified as potential full-length
genes. Functional classification revealed that these unisequences
covered a broad range of cellular, molecular and biological functions.
King et al. recently conducted high-throughput sequencing
analysis of the transcriptome of developing seeds using 454 sequencing
[80]. Using a single sequencing run, they obtained 46 Mb of raw
sequence data comprising 95,692 sequences. After assembly, they
obtained 12,419 contigs and 17,333 singletons. They found that storage
proteins were the most abundant transcripts, and that
metallothioneins, ribosomal proteins, and late embryogenesis
abundant proteins were also highly expressed. Curcin, a type-I
ribosome-inactivating protein, was also abundant, accounting for 0.7%
of the transcriptome. Purushothaman and Madasamy conducted 454
pyrosequencing of normalized cDNAs from flowers, mature leaves,
roots, developing seeds and embryos of J. curcas [81]. They obtained
381,957 high-quality reads from 383,918 raw reads. After assembly,
they obtained 17,457 contigs and 54,002 singletons. The assembled
transcripts averaged 916 bp in length, and 2589 of them were
full-length. The authors discovered that 2320 transcripts were related to
major biochemical pathways, including the oil biosynthesis pathway.
Comparison with other publicly available ESTs of jatropha showed that
14,327 assembled transcripts were novel. Silva-Junior et al. sequenced
ESTs from pooled RNA using two lanes of the Illumina sequencing
platform [76]. They obtained 11.8 gigabases of high-quality sequence.
Gu et al. constructed several cDNA libraries for different tissues [82]
and sequenced over 50,000 EST clones using Sanger sequencing. A number
of genes related to the synthesis of fatty acids were obtained. However,
the complete data set has not been published.
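
The assembly figures reported for these EST projects follow a simple bookkeeping rule: the number of unisequences is the sum of contigs and singletons, and annotation rates are fractions of that total. The sketch below applies this to the numbers quoted above for Yadav et al. [69]; the class name `AssemblySummary` and its fields are hypothetical conveniences, not part of any cited pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblySummary:
    contigs: int      # sequences assembled from two or more ESTs
    singletons: int   # ESTs that did not assemble with any other EST
    annotated: int    # unisequences with a database hit
    full_length: int  # unisequences judged to be potential full-length genes

    @property
    def unisequences(self) -> int:
        """Total non-redundant sequences: contigs plus singletons."""
        return self.contigs + self.singletons

    def percent(self, count: int) -> float:
        """Express a count as a percentage of all unisequences."""
        if self.unisequences == 0:
            raise ValueError("no unisequences in this assembly")
        return 100.0 * count / self.unisequences


if __name__ == "__main__":
    # Figures quoted in the text for the library of Yadav et al. [69].
    yadav = AssemblySummary(contigs=2258, singletons=4751,
                            annotated=6386, full_length=6233)
    print(yadav.unisequences)
    print(round(yadav.percent(yadav.annotated), 1))
    print(round(yadav.percent(yadav.full_length), 1))
```

The same check can be applied to the other studies in this section to confirm that the reported contig, singleton and unisequence counts are mutually consistent.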

The large number of transcripts will serve as an invaluable
genetic resource for genetic improvement of J. curcas, and sequence
information of genes involved in the biosynthesis of fatty acids
could be used for metabolic engineering of J. curcas to increase oil
content and to modify oil composition. However, a complete
reference transcriptome of J. curcas is still unavailable.
Transcriptomics studies require a high-quality, comprehensive reference
transcriptome that includes all transcripts, coding and noncoding,
large and small RNA [83]. Therefore, it is essential to
assemble all available EST data using sophisticated software (e.g.
CLC Genomics Workbench) to obtain a well-assembled and annotated
comprehensive transcriptome of J. curcas.
Fig. 3. Application areas of genomic resources in improving J. curcas. In the short term (1-2 years), genomic resources can be used in analyzing genetic diversity, population
relationships and parentage, as well as barcoding elite trees. In the middle term (2-4 years), genomic resources can be applied to linkage and QTL mapping, marker-assisted
selection (MAS) and producing transgenic trees for improving yield and quality of biodiesel. In the long term (>4 years), genomic resources can be utilized to identify a large
number of SNPs across the whole genome and to conduct genome-wide association studies (GWAS) and genomic selection (GS) for accelerating genetic improvement of J. curcas.


3.4.  miRNAs 

miRNAs, which are small noncoding RNAs, play crucial regulatory
roles in gene silencing by targeting mRNAs [84]. Because miRNAs can
inactivate either specific genes or entire gene families, artificial
miRNAs can function as dominant suppressors of gene activity when
they are introduced into a plant. Consequently, miRNA-based
manipulations of gene functions have emerged as promising new
approaches for genetic improvement of crops. This includes the
genetic modification of agronomic traits and the development of new
breeding strategies [85]. This strategy has been used in genetic
improvement of fruits [85,86]. In J. curcas, 52 putative miRNAs were
identified by
sequencing  2000  clones  from  a  small  RNA  library  of  leaves  and 
seeds  [87].  Among  them,  six  were  identical  to  known  miRNAs 
and 46 were novel. Quantitative real-time PCR revealed differential
expression patterns of 15 miRNAs in root, stem, leaf, fruit
and seed. Ten miRNAs were highly expressed in fruits and
seeds, suggesting that they are involved in seed development
or fatty acid synthesis in seeds. In addition, 28 targets of the
isolated  miRNAs  were  predicted  by  using  20,000  EST  sequences 
from  a  cDNA  library  [82].  These  miRNA  target  genes  encode 
a  broad  range  of  proteins.  Sixteen  targets  were  associated  with 
genes  belonging  to  the  three  major  gene  ontology  categories 
of  biological  process,  cellular  component,  and  molecular  function. 
Four targets were identified for the miRNA JcumiR004. When the
JcumiR004 primary miRNA was silenced, expression of the four
target genes was up-regulated and oil composition was modulated
significantly, indicating diverse functions of JcumiR004.
Vishwakarma et al. recently identified 22 miRNAs from ESTs and
genome survey sequences [88]. However, the number of miRNAs
identified in J. curcas is still limited, and identification of
additional novel miRNAs is essential. Applications of miRNAs in the
genetic improvement of J. curcas can be expected to emerge soon.


3.5.  Genome  sequence 

The genome of J. curcas has been sequenced by several research
institutes (e.g. Synthetic Genomics, USA; ACGT, Malaysia; Temasek
Life Sciences Laboratory, Singapore; Kazusa DNA Research Institute,
Japan) and companies (e.g. Life Technologies and SG Biofuels,
USA). However, only the results generated by Japanese scientists
have been published [75,89]. By integrating a de novo assembly of a
total of 537 million paired-end reads generated on the Illumina
sequencing platform into the previous genome assembly [75], a
new assembly was reported recently [89]. The newly assembled
genome was 297.7 Mb, consisting of 39,277 contigs. The average
and N50 lengths of the generated contigs were 7579 and
15,950 bp, respectively [75,89]. In addition, the authors collected
all  available  transcriptome  data  from  the  public  databases  and 
assembled  them  into  19,454  tentative  consensus  sequences.  By 
comparing  these  tentative  consensus  sequences  of  transcripts,  and 
updating  genome  sequences,  the  authors  predicted  a  total  of 
30,203  complete  and  partial  structures  of  protein-encoding  genes. 
The  number  of  genes  with  complete  structures  was  substantially 
increased  in  comparison  to  the  previous  genome  annotation.  The 
authors  further  analyzed  the  number  and  features  of  the  tandemly 
arrayed  genes,  syntenic  relations  between  J.  curcas  and  other  plant 
genomes,  and  structural  features  of  transposable  elements.  The 
detailed  information  on  the  updated  J.  curcas  genome  is  available 
at  http://www.kazusa.or.jp/jatropha/.  It  is  expected  that  the  draft 
genomic  sequence  and  accompanying  information  will  serve  as 
valuable  resources  for  speeding  up  fundamental  and  applied 
research of J. curcas. However, the assembled genome is still a
draft sequence. Further efforts on filling gaps, linking scaffolds
to linkage groups/chromosomes, and identifying DNA markers by
resequencing additional individuals are essential.
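
The average and N50 contig lengths quoted above are standard summary
statistics for a draft assembly. As a minimal sketch of how they are
computed, the following Python snippet uses a hypothetical function
name (n50) and made-up toy contig lengths; it does not use data from
the J. curcas assembly.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (contig lengths in bp), for illustration only.
contigs = [100, 200, 300, 400, 500]
mean_length = sum(contigs) / len(contigs)
print("mean contig length:", mean_length)
print("N50:", n50(contigs))
```

An N50 that is well above the mean contig length, as in the figures
reported for the draft genome, indicates that much of the assembled
sequence lies in a relatively small number of longer contigs.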

Genomic tools by themselves will not increase the productivity
and sustainability of the production of oil from J. curcas. However,
they will provide more approaches to accelerate genetic improvement
for increasing jatropha oil yield and quality. What is needed
now is to use the genomic tools to facilitate every step of the
breeding of J. curcas to increase the yield and quality of oil.

4.  Applications  of  genomic  resources  in  improving  biodiesel 
production 

Genomic resources summarized above have been used or are
being applied in accelerating both basic and applied research for
genetic improvement of jatropha for biodiesel production.
Application areas of genomic resources in J. curcas are summarized in
Fig. 3. Some selected areas demonstrating applications of genomic
resources are given below.

4.1. Assessing genetic variations

Genetic variations in natural populations are the sources of genetic
improvement. Therefore, information about genetic variations is
critically important in any breeding program. Genetic variations in
natural and cultured populations around the world have been studied
by using RAPD, ISSR, and AFLP [16,39,46,55,56,62-69]. The general
finding is that genetic variation in J. curcas varieties from
Asia, Africa and Brazil is low, whereas it is slightly higher in
varieties from Mexico [16,63,90,91], thus supplying a scientific base
for selective and hybrid breeding. The studies on genetic diversity
using microsatellites obtained results very similar to those
obtained using RAPD, ISSR and AFLP assays. Our lab has studied the
genetic diversity of 278 individuals of J. curcas collected from four
continents using an automated DNA sequencer, the ABI 3730xl (Applied
Biosystems). Surprisingly, we found no genetic variation (see
example of marker genotyping data in Fig. 4) at any of the 29
microsatellite loci (unpublished data). In addition, our lab has
constructed the first linkage map of jatropha using microsatellites and
SNPs developed by us (see details below). The 216 microsatellites and
290 SNPs mapped in the linkage map [28] were all homozygous in the
J. curcas mother, but all heterozygous in the J. integerrima x J. curcas
hybrid father. Recent SNP analysis showed that genetic variation in
J. curcas accessions was very low [77]. All these data suggest that
variation at the DNA level is extremely low in J. curcas.
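
Genetic variation at a marker locus is commonly summarized by the
expected heterozygosity, He = 1 - sum(p_i^2), where p_i is the
frequency of allele i. The sketch below shows how such a value could
be computed from microsatellite genotype calls. The function name and
the toy genotypes (allele sizes in bp) are hypothetical and are not
the unpublished data mentioned above.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2) for one locus.

    `genotypes` is a list of (allele_a, allele_b) tuples, one tuple per
    diploid individual.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    if not alleles:
        raise ValueError("no genotypes given")
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical monomorphic locus: every tree is homozygous for allele 152.
monomorphic = [(152, 152)] * 4
# Hypothetical polymorphic locus with two equally frequent alleles.
polymorphic = [(152, 156), (152, 152), (156, 156), (152, 156)]

print(expected_heterozygosity(monomorphic))
print(expected_heterozygosity(polymorphic))
```

A locus at which all individuals share one homozygous genotype, as
described for pure J. curcas, gives He = 0, whereas a locus with two
equally frequent alleles gives He = 0.5.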

4.2.  Molecular  barcoding  of  elite  trees 

DNA profile analysis can be used to distinctly identify individual
animals. Once the unique DNA profile is obtained, the information
can be used to differentiate individuals [92]. In comparison to
physical tagging, genetic identity established by DNA profiling is
more reliable and cannot be altered by humans. In tree breeding, an
important issue is to protect the results of lengthy breeding
programs, namely the selected elites. Once elite trees are selected,
they can be easily multiplied by tissue culture. Physical tags can
certainly be used to label elite trees. However, they can be easily
lost or changed. The ideal way is to tag the elites using their own
DNA profiles. Usually, the probability of two individuals sharing the
same genotype is about 10^-8 when using 8-10 microsatellite markers
[93,94]. In J. curcas, a similar genetic barcoding system using 11
microsatellites located on 11 linkage groups [28] has been developed
by our group, and is being used in identifying and protecting
elite trees (unpublished data). However, due to low microsatellite
variation in pure J. curcas, the identification power could also be
low. In the hybrid varieties generated by crossing J. curcas with
J. integerrima, and backcrossing, the identification power of the
11 markers could be higher. To save time and cost, it is essential
to develop a multiplex PCR that amplifies all markers in one
reaction [94].

Fig. 4. Genotypes at four SSR loci (Jatr0102, Jatr0079, Jatr0105, and Jcuint0215) showing that all individuals (1-4) of J. curcas are homozygous and share the same
genotype at each locus.
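
The identification power of a marker panel is often expressed as the
probability of identity (PI), the chance that two unrelated
individuals share the same multilocus genotype. A minimal sketch,
assuming Hardy-Weinberg proportions and independent loci, uses the
per-locus formula PI = sum(p_i^4) + sum over i<j of (2 p_i p_j)^2 and
multiplies across loci. The function names and allele frequencies
below are hypothetical and are not taken from the barcoding system
described above.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Probability that two unrelated individuals share a genotype at one
    locus, assuming Hardy-Weinberg proportions."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def panel_pi(loci):
    """Multilocus PI for independent loci (product of per-locus values)."""
    return prod(locus_pi(freqs) for freqs in loci)


# Hypothetical panel of 11 monomorphic loci (one allele fixed at each).
monomorphic_panel = [[1.0]] * 11
# Hypothetical panel of 8 loci, each with five equally frequent alleles.
variable_panel = [[0.2] * 5] * 8

print(panel_pi(monomorphic_panel))
print(panel_pi(variable_panel))
```

With fixed alleles at every locus the PI is 1, so the panel cannot
distinguish individuals; this is the situation to be expected in pure
J. curcas when microsatellite variation is absent. Panels of variable
loci drive the PI down rapidly as loci are added.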


4.3.  Identifying  candidate  genes  for  important  traits 

Besides the identification of DNA markers [69], genomic
resources (e.g. ESTs) can be used for transcript profiling to identify
candidate genes for traits of interest, as well as for developing
microarrays to study differential expression of genes in
different tissues at different developmental stages.

Genes related to fatty acid synthesis, stress resistance and
other important traits have been isolated from cDNA libraries and
by RNA-seq using NGS. For example, Tan et al. isolated the JcERF gene,
which is an ERF subfamily member. They found that the full-length
JcERF functioned effectively as a trans-activator in the yeast
one-hybrid assay [95]. In transgenic Arabidopsis, overexpression of
JcERF enhanced salt and freezing tolerance, whereas seed
germination was not affected. Their results suggest that JcERF
functions as a novel transcription factor. To identify novel genes
expressed during stress in J. curcas, Eswaran et al. screened a
cDNA library constructed from salt-stressed roots and obtained
32 full-length genes that can confer abiotic stress tolerance [96].
These genes could be over-expressed to generate and evaluate
transgenic plants for stress tolerance, and could also be used as
markers in breeding for salt stress tolerance. Jang et al. obtained
the JcDof1 gene from seedlings, and confirmed that its product
localized to the nucleus of onion epidermal cells [97]. This gene
exhibited DNA-binding and transcriptional activation activities in
yeast. JcDof1 expression was characterized by a circadian-clock
oscillation under long day, short day and continuous light conditions,
suggesting that JcDof1 is a circadian-clock Dof transcription factor
gene responding to light signals. A putative flowering-time-related Dof
transcription factor gene, JcDof3, was cloned and characterized by Jang
et al. Recently, Gu et al. cloned and characterized the genes accA,
accB1, accC and accD, which encode the subunits of heteromeric ACCase
[98]. They found that the accA, accB1, accC and accD genes were
temporally and spatially expressed in the leaves and endosperm. More
recently, 68 fatty acid and lipid biosynthetic genes have been
identified from a normalized cDNA library constructed using cDNA from
developing endosperm. Gu et al. investigated their expression at
different developmental stages of the endosperm [82]. They found that
the expression of the majority of fatty acid and lipid biosynthetic
genes was highly consistent with the development of oil bodies and
endosperm in seeds, while genes encoding enzymes with similar
functions may be differentially expressed during endosperm development.

In addition to the isolation of candidate genes related to
important traits, a method for rapid analysis of J. curcas gene
functions by virus-induced gene silencing (VIGS) has been developed
recently [99]. The method produced robust and reliable gene
silencing in plants agroinoculated with recombinant tobacco rattle
virus (TRV) harboring jatropha gene sequences. The VIGS method can
be used for high-throughput screening of jatropha genes and
analysis of their functions.


4.4.  QTL  mapping  for  oil  yield  and  quality 

Most economically important traits such as plant height, seed
yield, and oil content in seeds are quantitative in nature, and are
controlled by many genes, environmental factors and their
interactions. In most cases, the underlying single genes have small
effects. Quantitative trait loci (QTL) are gene clusters or
chromosomal regions influencing the expression of a quantitative trait
[100]. QTL mapping can facilitate the understanding of the number
and effects of genes that determine the expression of a trait and
assist in selective breeding to accelerate genetic improvement
[24,51]. In the genetic improvement of jatropha, seed yield, oil
content, oil composition, tree height, branch number, disease
resistance and pest resistance are the important traits. QTL mapping
has been conducted for some of these traits.

Using 105 microsatellites almost evenly covering the 11 linkage
groups (LGs) of the linkage map of jatropha, and a backcross
population of 296 jatropha trees, a total of 28 QTL for tree
growth and seed traits were mapped across the whole genome [25].
Two QTL, qTSW-5 and qTSW-7, for seed yield were located on LGs
5 and 7, respectively. On these two LGs, two QTL clusters harboring
five and four QTL, respectively, controlling yield-related traits were
detected. These two QTL clusters played pleiotropic roles in
regulating seed yield and plant growth. Positive additive effects
of the two QTL indicated higher values for the traits conferred by
the alleles from J. curcas, while negative additive effects of the
five QTL on LG 6, controlling plant height, branch number, female
flower number and fruit number respectively, demonstrated
higher values conferred by the alleles from J. integerrima.
Therefore, favored alleles from both parents could be integrated into
an elite jatropha plant by further backcrossing and MAS.

The major fatty acids in seed oil of jatropha are palmitic acid
(C16:0), stearic acid (C18:0), oleic acid (C18:1) and linoleic acid
(C18:2). High oleic acid and total oil content are desirable for
jatropha breeding. Composite interval mapping detected 18 QTL
for oil traits on the genome of jatropha [29]. A highly significant
QTL, qC18:1-1, was detected at one end of LG 1, with a percentage of
phenotypic variance explained (PVE) of 36.0%. The QTL qC18:1-1
overlapped with qC18:2-1, influencing the contents of oleic acid and
linoleic acid. Among the significant QTL controlling total oil
content, qOilC-4 was mapped on LG 4 with a PVE of 11.1%. Meanwhile,
oleosins are the major proteins of oil bodies and affect oil traits.
Three oleosin genes, OleI, OleII and OleIII, were mapped onto the
linkage map of jatropha using SNPs in these genes. OleI and OleIII
were mapped on LG 5, close to QTL controlling oleic acid and stearic
acid. Expression QTL (eQTL) for the three genes were mapped on LGs
5, 6 and 8, respectively. The eQTL for OleIII, qOleIII-5, was located
on LG 5 and overlapped with QTL controlling stearic acid and oleic
acid, implying a cis- or trans-element for OleIII affecting fatty
acid composition.
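
To illustrate what a PVE value expresses, the sketch below computes
the proportion of phenotypic variance explained by a single marker as
the R^2 of a simple linear regression of the trait on the marker
genotype. This is a deliberately simplified stand-in for the composite
interval mapping used in the cited study, and the function name,
genotype coding (0/1 for the two backcross genotype classes) and
trait values are all hypothetical.

```python
def pve_single_marker(genotypes, phenotypes):
    """Return (additive_effect, r_squared) from a simple linear regression
    of phenotype on a numeric genotype code."""
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching genotype/phenotype lists, n >= 3")
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("marker or trait shows no variation")
    slope = sxy / sxx                  # estimated effect of one allele copy
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return slope, r_squared


# Hypothetical backcross data: marker genotype class and oleic acid (%).
marker = [0, 0, 0, 0, 1, 1, 1, 1]
oleic = [40, 42, 44, 46, 44, 46, 48, 50]

effect, pve = pve_single_marker(marker, oleic)
print(f"effect = {effect:.2f}, PVE = {100 * pve:.1f}%")
```

In this toy example the marker shifts the trait mean by 4 percentage
points and accounts for about 44% of the phenotypic variance; the
remainder is attributed to other loci and to the environment.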

While  QTL  for  some  important  traits  have  been  mapped,  many 
important  traits  such  as  disease  resistance  and  pest  resistance, 
which  are  critically  important  for  the  sustainable  development  of 
the  jatropha  industry  for  biodiesel  production,  have  not  been 
studied  yet.  To  make  MAS  possible,  fine  mapping  and  confirmation 
of  mapped  QTL  are  essential. 

4.5.  Marker-assisted  selection  and  introgression 

Introgression of recessive genes and pyramiding of multiple
genes are very difficult using conventional breeding methods [53].
However, MAS is useful to overcome such problems. Several genes
can be pyramided either for the same trait or for different traits,
along with faster recurrent parent genome recovery through
intense background selection. In addition, MAS can be used to
introgress many recessive genes in less time than conventional
breeding [53]. In the case of J. curcas, some studies have already
been initiated to use molecular markers in breeding programs.
Liu et al. identified QTL for high oleic acid and total oil content, and
recommended integrating the QTL for selection of elite trees [29].
More recently, Sun et al. recommended using pleiotropic QTL
regulating plant growth and seed yield [29]. However, some time
is still needed before the outcome of MAS can be seen.

Apart from the introgression of genes/QTL linked to traits from
elite cultivars into the variety of interest, molecular markers are
helpful for introgression of genes from wild species, which are
generally inferior in agronomic performance, into elite cultivars. In
a previous QTL analysis, QTL linked to high branch number were
detected in J. integerrima [29]. The authors suggested that
markers linked to these QTL can be used to transfer the useful genes
from J. integerrima to J. curcas, while stringent background selection
is necessary to limit linkage drag by tracking the presence of unwanted
genomic segments of J. integerrima. However, the results of these
efforts have not yet been reported, but are expected soon.


5.  Bioengineering  for  improving  jatropha  for  the  production 
of  biodiesel 

Genome and transcriptome sequences have been used in
identifying genes that play an important role in fatty acid
synthesis, and their promoter regions. Information on the
coding and regulatory regions is critically important in modifying
fatty acid synthesis pathways using transgenic technologies to
improve the quality and quantity of biodiesel production from
J. curcas. Transformation approaches for generating transgenic
jatropha have already been optimized [101-103]. Research on
modifying fatty acid synthesis pathways through transgenic
technologies, with the aim of improving the quality and quantity of
oil from J. curcas, has been initiated in several countries, such as
Singapore, China, Japan and India. However, only limited data have
been published.

The ignition quality, heat of combustion and oxidative stability
of oil are affected by its fatty acid profile. An ideal biodiesel
contains a high percentage of monounsaturated fatty acids and few
polyunsaturated fatty acids. Oil from J. curcas contains 30-50%
polyunsaturated fatty acids (mainly linoleic acid), which
negatively impacts oxidative stability and causes a high rate of
nitrogen oxide emission. In Singapore, three types of the enzyme
1-acyl-2-oleoyl-sn-glycero-3-phosphocholine delta 12-desaturase
(FAD2), the key enzymes responsible for the production
of linoleic acid in plants, were identified through a whole-genome
approach. Using RNA interference technology [102], FAD2-1
was down-regulated in a seed-specific manner. The transgenic
JcFAD2-1 plants showed increased oleic acid (> 78%) and a
corresponding reduction in polyunsaturated fatty acids (< 3%) in
their seed oil, thus enhancing oil quality. The high seed oleic
acid content did not have a negative impact on other
jatropha agronomic traits (e.g. oil yield). This is probably the
world's first genetically modified jatropha plant for increasing
the quality of biodiesel from J. curcas. Field trials and
commercialization of the transgenic trees started in 2013.

Yin et al. from the Temasek Life Sciences Laboratory, Singapore,
filed a patent (US 2012/0073018 A1) on the isolation of J. curcas
curcin genes, tissue-specific promoters and the production of
curcin-deficient jatropha plants. Using RNAi transgenic technology,
curcin gene expression was suppressed, thus substantially
reducing the amount of curcin protein, which is harmful to human
health, in seeds and leaves. These transgenic plants reduce the
toxic effects of J. curcas on people who work with it. However, it is
not known whether suppressing expression of the curcin genes will
influence the performance of other economic traits (e.g. seed
yield, oil quality, resistance to pests and diseases).

In Japan, in an attempt to improve drought tolerance for sustainable
production of biodiesel from J. curcas, three kinds of transgenic
jatropha plants were generated [104]. In the first, the PPAT gene,
which encodes an enzyme of the CoA biosynthetic pathway, was
over-expressed. The second over-expressed the NF-YB gene, which
encodes a subunit of the NF-Y transcription factor. In the third, the
GSMT and DMT genes, which encode enzymes that catalyze the production
of glycine betaine, were up-regulated. Preliminary results suggest
that expression of the introduced GSMT and DMT genes significantly
enhances glycine betaine synthesis in jatropha, and thus should
effectively improve its drought tolerance.

In China, transgenic plants carrying the ω-3 fatty acid desaturase
gene FAD8, whose product rapidly converts dienoic acids to trienoic
acids under cold conditions, have been generated to improve cold
resistance in seedlings of J. curcas [105]. Some biological
parameters related to cold tolerance in the transgenic plants suggest
that the transgenic trees are tolerant to cold. However, no field test
data have been released.

Although the experimental data on transgenic jatropha with
increased oil quality, cold and drought resistance and reduced
toxicity are very promising, it seems that the transgenic
jatropha trees have not yet gone through extensive field tests.
The general productivity of these transgenic plants for producing
oil under commercial plantation conditions is unknown. Besides
improvement of the targeted traits, oil yield must be maintained
or improved in the transgenic plants.


6.  Future  directions  in  increasing  biodiesel  yield  and  quality 
from  J.  curcas 

The current public and private interest in jatropha has triggered
large-scale investments and expansion of its plantations.
Genetic improvement for increasing oil yield using conventional
breeding approaches has been initiated in some countries. However,
the current oil yield is still too low to make jatropha
plantations profitable and sustainable. Genomic resources have
been developed for speeding up the genetic improvement of
J. curcas, and some have already been used in evaluating
genetic diversity in natural and cultured populations, in
constructing a linkage map and in mapping QTL for some important
traits. However, jatropha genomics has lagged far behind that of model
and other agricultural systems. It is essential to develop a
high-density linkage map to find DNA markers associated with high oil
yield. With the availability of the draft genome sequence and
transcriptome, this task should not be difficult. Although QTL analyses
have been carried out to identify DNA markers associated with
oil yield and quality in populations generated by interspecific
hybridization, most QTL were mapped only to large marker intervals.
Only QTL with moderate to large effects were detected with
the current experimental design. Further confirmation and fine
mapping of identified QTL for oil yield and quality in different
populations are essential for future MAS. No QTL for oil yield and
quality in pure J. curcas has been reported, probably
because genetic variation in J. curcas is too narrow.

Because QTL mapping identifies only QTL with moderate to
large effects, QTL with small effects are missed. Owing to the rapid
development of high-throughput and cost-effective genotyping of
large numbers of DNA markers (e.g. SNPs), researchers are starting
association mapping based on linkage disequilibrium (LD) using a
large number of DNA markers covering the whole genome in
model organisms, humans, livestock and agronomic plant species.
The approach of associating many DNA variants (e.g. > 500,000 SNPs)
across the whole genome with traits measured in many individuals
(e.g. > 1000) is called a genome-wide association study (GWAS)
[106,107]. This technique has discovered associations of particular
genes with a number of common diseases in humans [106,107]. Because
GWAS are based on LD, they are able to detect marker-trait associations
with very small effects. For GWAS, natural populations are usually
used. If alleles at markers are significantly associated with superior
phenotypes, these markers can be used for selection across breeding
populations. Marker-assisted selection using DNA markers associated
with traits of interest that were identified in GWAS is called genomic
selection (GS). A previous study showed that breeding values can be
predicted with high accuracy using SNPs along the whole genome
[108]. In J. curcas, GWAS could be even more attractive, as the genome
of J. curcas is very small (ca. 400 Mb). The cost of genotyping for
GWAS could therefore be much lower in J. curcas than in species with
bigger genomes. In addition, GS in jatropha would have other
advantages: large training populations can be easily obtained, and the
extent of LD could be very high in superior trees with the small
effective population size frequently used in current breeding programs.
The recent development of genotyping by sequencing (e.g. RAD-seq) [109]
has drastically reduced the cost of genotyping SNPs, which makes GWAS
and GS feasible. However, GWAS requires a well-assembled reference
genome. Therefore, research priority should be given to acquiring a
well-assembled and annotated reference genome sequence. This could
be accomplished in the near future, as the Japanese scientists have
already assembled a draft genome of J. curcas. It is expected that in
the next few years the cost of genotyping by NGS will be reduced
substantially (at least 10-fold). In the near future, GWAS promises to
yield numerous SNP markers that could be used in GS for early
selection of superior alleles associated with a wide range of traits
(including oil yield and quality) in J. curcas. As the efficiency of
DNA sequencing, SNP discovery, genotyping and other molecular
procedures improves and experimental costs decrease, the
opportunities to incorporate NGS and GS into breeding programs for
improving jatropha and biodiesel will substantially increase.
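
To make the idea of predicting breeding values from genome-wide
markers concrete, the following sketch fits all marker effects jointly
by ridge regression and then scores an unphenotyped candidate. Ridge
regression is used here only as a simple, commonly used illustration;
it is not claimed to be the method of the study cited above [108].
All function names, the shrinkage parameter lam, the genotype coding
(-1/0/1) and the toy data are assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam):
    """Estimate marker effects b = (X'X + lam*I)^-1 X'(y - mean(y))."""
    n, m = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    yc = [y - mean_y for y in phenotypes]
    xtx = [[sum(genotypes[k][i] * genotypes[k][j] for k in range(n))
            + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    xty = [sum(genotypes[k][i] * yc[k] for k in range(n)) for i in range(m)]
    return mean_y, solve_linear_system(xtx, xty)


def predict(genotype, mean_y, effects):
    """Predicted value = training mean + sum of marker effects."""
    return mean_y + sum(g * b for g, b in zip(genotype, effects))


# Hypothetical training population: 4 trees, 2 SNPs, trait in arbitrary units.
train_genotypes = [[1, 0], [-1, 0], [0, 1], [0, -1]]
train_phenotypes = [12.0, 8.0, 11.0, 9.0]

mean_y, effects = ridge_marker_effects(train_genotypes, train_phenotypes, 2.0)
candidate = [1, 1]  # hypothetical seedling genotyped but not yet phenotyped
print(effects, predict(candidate, mean_y, effects))
```

The penalty lam shrinks each estimated marker effect toward zero,
which is what allows far more markers than phenotyped trees to be
fitted at once; in practice lam would be chosen by cross-validation
or from variance component estimates.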

Another application of NGS is in studies on the expression of all
genes, for which NGS, in combination with sophisticated
bioinformatic tools, will surely replace microarray experiments soon.
In comparison to other gene expression approaches (e.g. microarray
and real-time PCR), NGS technologies can provide more
comprehensive insights into the spatial and temporal control of gene
expression. Therefore, it can be anticipated that NGS will facilitate
GS. NGS can also speed up the development of transgenic
technologies for improving J. curcas and biodiesel, because it
becomes easier to modify genes and their regulatory elements
with the increasing availability of genomic resources. Although
analysis of large sets of NGS data is still a very difficult task
at present, significant progress is being made in improving existing
bioinformatic and statistical tools, and in developing new
algorithms and approaches for this task. We strongly believe that
there will be an exponential increase in the use of NGS technologies
for speeding up the improvement of J. curcas and biodiesel. The
results of these efforts will have a profound impact on the
J. curcas industry.


7.  Conclusion 

Although J. curcas is a promising candidate for producing
biodiesel, the genetic improvement of J. curcas through
conventional breeding is too slow to make the
production of biodiesel from J. curcas sustainable. Genomic
resources hold great promise for accelerating genetic improvement
for sustainable production of biodiesel. Although some
genomic resources have been developed and applied to the genetic
improvement of J. curcas, jatropha genomics has lagged far behind that
of model and other agricultural systems. Further development and
application of genomic resources (e.g. a well-assembled genome
sequence and a large number of SNPs) are essential for rapidly
increasing oil yield and quality from J. curcas. NGS technologies
will speed up the development of genomic resources, and accordingly
will accelerate the genetic improvement of J. curcas for
sustainable production of biodiesel. The future of J. curcas as a
plant species for producing biofuel is bright.